(57) Abstract: A distributed network management system (10) and method of operation. The
system (10) includes at least one hub server (12) and at least one remote server (16), where
the hub server (12) and the remote server (16) communicate with each other. The remote server
(16) additionally communicates with and monitors one or more network devices (20). In the
event that the remote server (16) becomes inoperational, the hub server (12) assumes
monitoring of the network device (20). For redundancy, primary (12) and secondary (14) hub
servers can be provided, wherein the primary (12) and secondary (14) hub servers communicate
with each other and are capable of communicating with the remote server (16). For further
redundancy, primary (16) and secondary (18) remote servers can be provided, wherein the
primary (16) and secondary (18) remote servers communicate with each other but independently
monitor the network devices (20). In the peered remote configuration, the hub server (12) is
capable of communicating with either of the remote servers (16, 18). Where both the hub
servers (12, 14) and the remote servers (16, 18) are peered, each hub server (12, 14) is
capable of communicating with each remote server (16, 18).






WO 02/03211 PCT/US00/23728
DISTRIBUTED NETWORK MANAGEMENT SYSTEM AND METHOD 

BACKGROUND OF THE INVENTION 

1. Field of the Invention 

This invention pertains generally to network communications, and more particularly
to monitoring and managing network performance.

2. Description of the Background Art 

In the operation of interconnected networks, it is often desirable to have a mechanism
for monitoring the state of equipment and devices in the network. Traditionally, this has been
accomplished using a centrally-based network management system, with a plurality of
individual network management systems feeding up to the central network management
system in a conventional tree hierarchy. Equipment and devices would similarly feed up to
the individual network management systems in a conventional tree hierarchy. Unfortunately,
such an architecture for a network management system does not scale well and does not
provide for propagation of state and configuration information among a set of cooperating
systems.

BRIEF SUMMARY OF THE INVENTION 
The present invention is a scalable distributed network management system with the
potential for full redundancy at hub and remote levels. The remotes monitor state changes of
network devices, and those state changes propagate bidirectionally between hubs and
remotes. Furthermore, configuration changes for designating the monitoring parameters of
the remotes propagate bidirectionally between remotes and hubs.

By way of example, and not of limitation, the system includes at least one hub server
and at least one remote server, where the hub server and the remote server communicate with
each other. The remote server additionally communicates with and monitors one or more
network devices. In the event that the remote server becomes inoperational, the hub server
assumes monitoring of the network device(s).

According to another aspect of the invention, for redundancy, primary and secondary
hub servers can be provided, wherein the primary and secondary hub servers communicate
with each other. In this peered hub configuration, if the primary hub server becomes
inoperational and the secondary hub server is operational, the secondary hub server
communicates with the remote server. Additionally, in the peered hub configuration, if both
the primary hub server and the remote server are inoperational, the secondary hub server
assumes monitoring of the network devices.

According to another aspect of the invention, for redundancy, primary and secondary
remote servers can be provided, wherein the primary and secondary remote servers
communicate with each other but independently monitor the network devices. In the peered
remote configuration, if the primary remote server becomes inoperational, the primary hub
communicates with the secondary remote.

According to a still further aspect of the invention, if the remotes and the hubs are
peered and the primary hub is inoperational, the secondary hub communicates with the
primary remote thereby temporarily assuming the duties of the primary hub. Also in the
peered hub and peered remote configuration, if both the primary hub and primary remote are
inoperational, the secondary hub communicates with the secondary remote. If both remotes
are inoperational, then all active hubs assume monitoring of the network devices.

To facilitate monitoring of network devices, the invention derives state information
from network devices using what is referred to herein as LTP. In
LTP, a plurality of pings is sent from an ICMP server to an interface address on a network
device during a polling interval. The number of pings returned from said network device is
recorded and converted to a percentage based on the ratio of the number of pings received to
the number of pings sent. Next, an SNMP query is sent to the network device and the
operational status of the network device, such as "up", "down" or "unknown", is determined
from the SNMP query. Using the percentage of pings returned and the SNMP status, a status
percentage for the polling period is generated by multiplying the percentage of pings returned
by a constant associated with the operational status, where the constant has a first value if the
operational status is "up", a second value if the operational status is "down", and a third value
if the operational status is "unknown". Next, a weighted average of the status percentages for
the current and previous four polling periods is computed. Then, the state of the network
device is determined from the weighted average.

An object of the invention is to provide a distributed network management system
where configuration information propagates bidirectionally through the system.

Another object of the invention is to provide a distributed network management
system where configuration information can be entered at one location and propagate through
the system.

Another object of the invention is to provide a distributed network management
system which can be accessed through a web server.

Another object of the invention is to provide a distributed network management
system where state changes propagate bidirectionally through the system.

Another object of the invention is to provide a peered distributed network
management system with automatic failover and resynchronization.

Another object of the invention is to provide a distributed network management
system which consolidates multiple status notifications into a single notification based on
an interface hierarchy.

Another object of the invention is to provide a distributed network management
system with a plug-in architecture of service, notification and utility modules.

Another object of the invention is to provide a distributed network management
system that can serve as an information transport.

Further objects and advantages of the invention will be brought out in the following
portions of the specification, wherein the detailed description is for the purpose of fully
disclosing preferred embodiments of the invention without placing limitations thereon.

BRIEF DESCRIPTION OF THE DRAWINGS 
The invention will be more fully understood by reference to the following drawings
which are for illustrative purposes only:

FIG. 1 is a schematic diagram of the high level architecture of an embodiment of a
distributed network management system according to the invention depicting the primary hub
and the primary remote as being operational, and the primary hub as communicating with the
primary remote.

FIG. 2 is a schematic diagram of the distributed network management system of FIG.
1 depicting the primary hub as being operational, the primary remote as being inoperational,
the secondary remote as being operational, and the primary hub communicating with the
secondary remote.

FIG. 3 is a schematic diagram of the distributed network management system of FIG.
1 depicting the primary hub as being inoperational, the secondary hub as being operational,
the primary remote as being operational, and the secondary hub communicating with the
primary remote.

FIG. 4 is a schematic diagram of the distributed network management system of FIG. 1
depicting the primary hub as being inoperational, the secondary hub as being operational, the
primary remote as being inoperational, the secondary remote as being operational, and the
secondary hub communicating with the secondary remote.

FIG. 5 is a schematic diagram of the distributed network management system of FIG.
1 depicting the primary and secondary remotes as being inoperational, and the primary and
secondary hubs communicating with the network devices.

FIG. 6 is a schematic diagram of an implementation of a distributed network
management system according to the invention.

FIG. 7 is a schematic diagram showing an alternative embodiment of the distributed
network management system implementation of FIG. 6 wherein hubs are regionalized.

FIG. 8 is a functional block diagram of the internal architecture of a remote according
to the present invention.

FIG. 9 is a functional block diagram of an alternative embodiment of the remote 
architecture of FIG. 8. 

FIG. 10 is a functional block diagram of the dNMS kernel portion of a remote
according to the present invention.

FIG. 11 is a schematic diagram of an integration server in the dNMS kernel of
FIG. 10.

FIG. 12 is a schematic diagram of a monolithic server in the dNMS kernel of
FIG. 10.

FIG. 13 is a schematic diagram showing data flow between the integration server of
FIG. 11 and the monolithic server of FIG. 12.

FIG. 14 is a schematic diagram depicting traffic flow between hubs and remotes
through queuing according to the invention.

DETAILED DESCRIPTION OF THE INVENTION

Referring more specifically to the drawings, for illustrative purposes the present
invention is embodied in the components, system and methods generally shown in FIG. 1
through FIG. 14. It will be appreciated that the invention may vary as to configuration and
details without departing from the basic concepts as disclosed herein.

FIG. 1 is a schematic diagram of the high level architecture 10 of an embodiment of a
distributed network management system according to the present invention. In the
embodiment shown, the system comprises a primary hub 12 and a secondary hub 14, both of
which can communicate with a primary remote 16 and a secondary remote 18. The remotes
in turn communicate with a specific set of devices 20 on nodes 22 of the network 24, such as
routers, to monitor network status. The network may be all or a portion of the Internet or
other wide area network. The set of network devices is selected to provide an overall
representation of the network being monitored.

Each hub is in active communication with the other hub through a full-time
communications link 26 for redundancy, so that data received from one hub is continuously
propagated to the other. Similarly, each remote is in active communication with the other
remote through a full-time communications link 28 for redundancy and for continuously
propagating data to the other remote. In addition, each remote is in constant communication
with each network device. However, each remote preferably monitors the network devices
independent of the other remote. As a result, the data acquired by a remote may disagree with
the data acquired by the other remote, even though both remotes are monitoring the same
network devices. Because the remotes operate independently of each other, the monitoring
times could be different and a particular remote may observe a network condition that was not
observed by the other remote. For example, one remote may monitor conditions thirty
seconds into each minute, while another remote may monitor conditions forty-five seconds
into each minute.

Primary hub 12 is in full-time communication with primary remote 16 through
communication link 30 so that changes detected by primary remote 16 are continuously
propagated to primary hub 12 as well as to secondary hub 14 through primary hub 12. In
addition, configuration data such as which network devices to monitor can be propagated to
primary remote 16 and to secondary remote 18 through primary remote 16. Note, however,
that there is also a normally inactive communication link 32 between secondary hub 14 and
secondary remote 18, a normally inactive communications link 34 between secondary hub 14
and primary remote 16, and a normally inactive communications link 36 between primary
hub 12 and secondary remote 18. These communications links are not necessarily direct
physical links, however. In the preferred embodiment of the invention, each remote and
network device has an address, such as an Internet Protocol (IP) address. This allows the
remote or network device to be accessed over a network such as, for example, the Internet. In
addition, each hub can communicate directly with a network device as well.

With the architecture described above, the preferred communications hierarchy is as
follows:

1. if the primary hub and the primary remote are operational, the primary hub
communicates with the primary remote as shown in FIG. 1.

2. if the primary hub is operational, the primary remote is inoperational, and the
secondary remote is operational, the primary hub communicates with the secondary remote as
shown in FIG. 2.

3. if the primary hub is inoperational, the secondary hub is operational, and the
primary remote is operational, the secondary hub communicates with the primary remote as
shown in FIG. 3.

4. if the primary hub is inoperational, the secondary hub is operational, the
primary remote is inoperational, and the secondary remote is operational, the secondary hub
communicates with the secondary remote as shown in FIG. 4.

5. if both the primary and secondary remotes are inoperational, all active hubs
assume monitoring of the remote network as shown in FIG. 5.
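
The hierarchy above can be read as a small decision function. The following Python sketch is illustrative only: the function name `active_links`, the string labels, and the empty result returned when no hub is operational are assumptions made for the example and do not come from the specification.

```python
from typing import List, Tuple


def active_links(primary_hub_up: bool, secondary_hub_up: bool,
                 primary_remote_up: bool,
                 secondary_remote_up: bool) -> List[Tuple[str, str]]:
    """Return (hub, target) pairs that should be active under the hierarchy."""
    hubs = [name for name, up in (("primary hub", primary_hub_up),
                                  ("secondary hub", secondary_hub_up)) if up]
    if not hubs:
        # Assumption: with no operational hub there is nothing to select.
        return []
    if not (primary_remote_up or secondary_remote_up):
        # Case 5: every active hub monitors the network devices directly.
        return [(hub, "network devices") for hub in hubs]
    # Cases 1 to 4: prefer the primary hub, then prefer the primary remote.
    hub = hubs[0]
    remote = "primary remote" if primary_remote_up else "secondary remote"
    return [(hub, remote)]
```

For instance, calling `active_links(False, True, False, True)` corresponds to case 4 (FIG. 4) and selects the secondary hub and the secondary remote.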

Referring now to FIG. 6, an example of a possible geographical configuration of a
distributed network management system according to the invention is shown. In FIG. 6, a
first set of hubs 38 is shown located in the vicinity of Seattle and a second set of hubs 40 is
shown located in the vicinity of New York City. Also shown are several sets of remotes 42,
44, 46, 48, 50, 52, 54, 56, and 58, each of which monitors a portion of the overall network.
Note that hubs 38 monitor remotes 42, 44, 46, and 48, while hubs 40 monitor remotes 50, 52,
54, 56, and 58. A change of state monitored by, for example, remotes 50 will propagate to
hubs 40 in New York City, and from hubs 40 to sister hubs 38 in Seattle so that both sets of
hubs have the same state information.

While the foregoing configuration is scalable, the addition of a larger number of
remotes or hubs can become more complex than necessary. In that event, an additional
monitoring layer can be added above the hubs. In this way, not only are remotes assigned to
regions of the network, but hubs are assigned to regions of the network as well. For example,
referring to FIG. 7, three regions 60, 62 and 64 are shown. Each region would include a
primary and secondary hub that would be responsible for that region. For example, primary
hub 66 and secondary hub 68 would be responsible for region 60, primary hub 70 and
secondary hub 72 would be responsible for region 62, and primary hub 74 and secondary hub
76 would be responsible for region 64. In turn the hubs in a particular region would be
responsible for several sets of primary and secondary remotes in that region, such as set 78,
78', 78" ... in region 60, set 80, 80', 80" in region 62, and set 82, 82' and 82" in region 64,
and each set of remotes would be responsible for a portion of the network devices therein.
The data collected by the primary hubs in each region would be propagated to a primary hub
aggregator 84, which in turn would propagate the data to a secondary hub aggregator 86 for
redundancy. In this way, a multi-level distributed system architecture can be achieved.

Referring now to FIG. 1 and FIG. 8, an embodiment of the internal architecture of
primary 16 and secondary 18 remotes is shown. Each remote includes a dNMS kernel 88 that,
in addition to other functions that will be described, acquires data from the network 24. Also
shown is a scheduler 90, which is a plug-in service that notifies administrative personnel that
a problem exists on the network being monitored.

Each remote is accessible through a client terminal 92 running a browser-based
application interface. Note that data propagates from the network to each dNMS kernel
through a data path 94, and that configuration changes received from a hub (not shown)
propagate to each dNMS kernel through a configuration path 96.

Optionally, the remotes can include a collector 98, which is also a plug-in service, to
which data from the network propagates and is stored in data files 100 for billing or other
purposes. Also shown is a module 102 for mining the stored data and a module 104 for
collating the mined data into a central database 106 accessible by a client terminal 108 for
billing. The details of those components are not described herein as they do not form a part
of the invention and are shown solely to indicate additional ways in which the data acquired
by a remote can be used. In the event that such additional uses of the data are made,
processing overhead of the remotes may increase. In that event, it is preferred to reduce the
load on the primary remote by moving the auxiliary data collection functions into a separate
remote server 110 as shown in FIG. 9. The primary remote 16 is then dedicated to
monitoring network conditions, while server 110 is dedicated to the auxiliary data collection
functions. Secondary remote 18 can be configured as before, or unloaded in the same way.

Note that primary 12 and secondary 14 hubs in FIG. 1 would be configured in the
same manner as the remotes. Note also that configuration information, as well as state
information, propagates bidirectionally between hubs and remotes and between peers (e.g.,
hub to hub or remote to remote).

As can be seen, therefore, a critical element of a hub and a remote is the dNMS
kernel 88. Referring now to FIG. 10, which shows primary remote 16 as an example, the
high level architecture of dNMS kernel 88 comprises an integration server 112 and a
monolithic server 114. Integration server 112 communicates with client terminal 92 and
monolithic server 114 communicates with the network devices connected to network 24.

In the case of a remote, state information relating to the network devices collected by
monolithic server 114 is propagated to integration server 112 and then propagated to the
integration server in primary hub 12, for example. Furthermore, in the case of a remote,
configuration information such as the IP addresses of the network devices to be monitored is
entered into integration server 112 from client terminal 92, from which it propagates down to
monolithic server 114 as well as propagates up to the integration server in primary hub 12.
Alternatively, configuration information can be entered into a hub, in which case the
configuration information propagates down to the integration server and the monolithic server
in the remotes. While configuration information is entered into a dNMS kernel by a client
terminal, state information for the network devices is acquired. In the preferred embodiment
of the invention, state information is derived using what will be referred to herein as LTP,
which is an acronym developed by the inventors herein. LTP provides for simple real time
monitoring of network devices and their interfaces using ICMP, SNMP or a combination
thereof, and employs a sliding window to compensate for minor interruptions in Internet links
or IP traffic.

In LTP according to the present invention, a polling interval is defined during which
each ICMP server sends out a plurality of ICMP echo requests, or pings. While the polling
interval and number of pings can vary, in the preferred embodiment ten pings are sent every
sixty seconds, with each ping being separated by a one-second interval. The number of pings
that are returned is converted to a percentage for that polling interval.

In addition, for that same polling interval, if the node is SNMP-enabled (which may
not be the case for servers and other non-router equipment), an SNMP query is sent to the
node on which the interface resides. The "operational status" of the interface is queried as to
three possible states: "up", "down", and "unknown". An "unknown" operational status means
that the SNMP request was never returned and, therefore, the system does not know the
status.

Using the percentage of pings returned and the SNMP status, a single number is
generated for the polling period. This number is generated by multiplying the percentage of
pings returned by a constant that is assigned depending on the result of the SNMP query;
namely, a value of one if the query returned "up", a value of zero if the query returned
"down", and a value of 0.4 if the query returned "unknown". In essence, if the SNMP query
returned "up", we simply use the percentage of returned ICMP packets. If the query returned
"down", we discard the ICMP information and take the time period as being zero percent. If
the query returned "unknown", we assume that there is a routing problem and multiply the
percentage of ICMP packets by an arbitrary value of four tenths (0.4). For example, if ten out
of ten pings are returned during a polling interval, but we were unable to obtain SNMP
information for that interface during that time period, the ratio for that time period would be
forty percent (40%). Table 1 shows examples of various network conditions, given different
SNMP and ICMP values, including the total ratio computed for the time period.
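
The per-interval calculation described above can be sketched in a few lines of Python. The constants 1, 0 and 0.4 and the ten-pings example come from the text; the names `SNMP_FACTOR` and `status_percentage` are assumptions for illustration, and the sketch does not address nodes that are not SNMP-enabled, which the passage above leaves open.

```python
# Multipliers taken from the text: 1 for "up", 0 for "down", 0.4 for "unknown".
SNMP_FACTOR = {"up": 1.0, "down": 0.0, "unknown": 0.4}


def status_percentage(pings_sent: int, pings_returned: int,
                      snmp_status: str) -> float:
    """Combine the ICMP return rate and the SNMP status for one polling interval."""
    if pings_sent <= 0:
        raise ValueError("pings_sent must be positive")
    ping_percentage = 100.0 * pings_returned / pings_sent
    return ping_percentage * SNMP_FACTOR[snmp_status]
```

With ten of ten pings returned and an "unknown" SNMP status, `status_percentage(10, 10, "unknown")` yields forty percent, matching the example in the text.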

Once the percentage is computed in this manner, the next step is to compute a
weighted average of the percentages for the current and previous four time periods. This is
preferably carried out with a five-element table with a sliding window. The percentage for
the current time is inserted in the rightmost (e.g., current period) slot. If the current period
slot is not empty, all values in the table are shifted to the left by one slot (i.e., the oldest data
is dropped). Therefore, each position in the table represents a different time period's ratio.
The leftmost slot contains data that is four polling intervals old and, as the table is traversed
to the right, the data is more recent.
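
One way to picture the five-slot table is as a bounded double-ended queue. The sketch below is an assumption about representation rather than the specification's implementation: it uses Python's `collections.deque` with `maxlen`, which discards the oldest (leftmost) entry when a new value is appended to a full window. The class name `SlidingWindow` and its methods are hypothetical.

```python
from collections import deque
from typing import List


class SlidingWindow:
    """Keeps the most recent polling-period percentages, oldest on the left."""

    def __init__(self, size: int = 5) -> None:
        self._slots = deque(maxlen=size)

    def record(self, percentage: float) -> None:
        # Appending to a full deque drops the leftmost (oldest) value,
        # which mirrors shifting the table left by one slot.
        self._slots.append(percentage)

    def values(self) -> List[float]:
        return list(self._slots)
```

Unlike a fixed table, this window simply holds fewer than five values until five polling periods have elapsed.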

Each position in the table is also assigned a weight, which affects the extent to which
that position in the table will influence the final percentage; that is, the state of the interface.
Higher weights are assigned to the more recent polling intervals, as they are more indicative
of the current state. Note, however, that the weights should not be too high; otherwise, the
result will be over-notification of problems with the interface. In other words, if the weights
are set too high, the normal intermittency in the Internet will result in unnecessary
notification. By keeping the weights low, some flapping of the interface is allowed without
over-notification. Therefore, the weights can vary and are typically set using empirical data.

Table 2 shows an example of a completely filled in sliding window for an interface
that, while having an "up" operational state as far as the router is concerned, is dropping a
considerable number of ICMP packets. Table 3 shows the relationship between the
percentage for the polling period and the "total ratio" once the weights are applied. To arrive
at the forty-five percent (45%) total ratio, we take all of the positions in the table into
account. The position percentage is multiplied by the weight for all positions to arrive at the
resulting percentage for all positions. The resulting percentages are then added and divided
by the sum of the weights. Given this total percentage, the final state of the interface is
computed. Referring to Table 4, if the percentage is greater than sixty percent (60%), the
interface is considered "up". If the percentage is between forty percent (40%) and sixty
percent (60%), the state is either intermittent or unknown. However, it is unknown if and
only if the last SNMP poll came back as "unknown"; otherwise, it is intermittent. If the ratio
is less than forty percent (40%), the interface is "down".
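
The weighting and thresholding steps can be sketched as follows. The sixty and forty percent thresholds come from the text. The function names are hypothetical, and the weights in the usage note below are made-up values, since the weights of Tables 2 and 3 are not reproduced in this passage. Treating exactly 40% and exactly 60% as falling in the middle band is also an assumption.

```python
from typing import Sequence


def weighted_ratio(percentages: Sequence[float],
                   weights: Sequence[float]) -> float:
    """Sum of percentage * weight over all slots, divided by the sum of weights."""
    if len(percentages) != len(weights):
        raise ValueError("one weight is required per table position")
    weighted_sum = sum(p * w for p, w in zip(percentages, weights))
    return weighted_sum / sum(weights)


def interface_state(total_ratio: float, last_snmp_status: str) -> str:
    """Map the total ratio onto up / intermittent / unknown / down."""
    if total_ratio > 60.0:
        return "up"
    if total_ratio < 40.0:
        return "down"
    # Middle band: "unknown" only if the last SNMP poll was "unknown".
    return "unknown" if last_snmp_status == "unknown" else "intermittent"
```

As a usage example with hypothetical weights `[1, 2, 3, 4, 5]` (oldest to newest) and percentages `[100, 100, 50, 50, 0]`, `weighted_ratio` gives 650/15, roughly 43.3, which `interface_state` classifies as "intermittent" when the last SNMP poll returned "up".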

It can be appreciated at this point that a hub and remote each comprise software
executing on hardware. The hardware comprises one or more conventional computers and
associated peripherals and communications interfaces. The dNMS kernel is a software engine
executable on a computer that is integral to a hub or a remote. Preferably, the engine is never
modified; instead, for flexibility and scalability, the invention employs "plug-ins" to
implement specific functions. A "plug-in" as the term is used herein is a software module
that carries a unique file name. Additionally, the only information that need be changed in
the dNMS kernel is the configuration information that controls the functioning of a plug-in
service, such as LTP described above. The dNMS kernel sends the configuration
information, such as device addresses and how often a plug-in should perform a specified
function on one or more devices, to the plug-ins and the monolithic server, and the monolithic
server monitors the network devices based on the configuration information acquired by the
plug-ins.

Monolithic server processing according to the invention can be summarized in terms
of nodes (e.g., routers, servers, or topological containers for the same), interfaces (e.g.,
physical interfaces, IP addresses), services and notifiers. While nodes and interfaces have
states, neither a node nor an interface knows how to determine its own state. Nodes and
interfaces only have states because they are associated with services that have a state.
Therefore, state information is derived from services; namely, an action performed on a node
or interface that returns information. A service has a state by definition and is the only object
that determines state on its own. An example of a service, as described above, is LTP.

In the present invention, a notifier is a plug-in that routes state information to another
service, such as scheduler 90 in FIG. 8. If a service has determined that a change of state has
taken place, a notifier is called. Therefore, a notifier is called when the state of a service is
changed. In contrast, states of interfaces and nodes are determined by their owned services,
but a notifier is not called when the state of an interface or node changes. Note, however, that
generally speaking the state change of a service will cause a change of state for the
corresponding interface or node.

Note, however, that the state of an interface is defined as the worst state of any of its
services, and that the state of a node is defined as the worst state of its interfaces, sub-nodes,
and services. This means that a state change of a node or an interface is dictated by a
downstream state change, which may not represent all objects on that node or interface.
Accordingly, to manage the number of notifications resulting from state changes on a node or
interface, the present invention employs a "toggle notification flag" associated with nodes and
interfaces. By setting the flag, an object will be ignored in an upstream state determination.
For example, if a node contains multiple interfaces, the state of one or more of the interfaces
can be ignored for purposes of determining the state of the node. Notifiers are not called for
interfaces or nodes that have their "toggle notification flag" set.
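
The worst-state rule and the effect of the flag on upstream state determination can be sketched as below. This is a minimal illustration under stated assumptions: the severity ordering (up, intermittent, unknown, down), the default of "up" for an object with nothing to aggregate, and all class and field names (including `ignore_upstream` for the "toggle notification flag") are hypothetical, since the text does not specify them. Notifier dispatch is not modeled.

```python
from dataclasses import dataclass, field
from typing import Iterable, List

# Assumed ordering from best to worst; the specification does not rank states.
SEVERITY = {"up": 0, "intermittent": 1, "unknown": 2, "down": 3}


def worst(states: Iterable[str]) -> str:
    """Return the most severe state, or "up" if there are none (assumption)."""
    return max(states, key=SEVERITY.__getitem__, default="up")


@dataclass
class Service:
    name: str
    state: str = "up"  # only services determine state on their own


@dataclass
class Interface:
    name: str
    services: List[Service] = field(default_factory=list)
    ignore_upstream: bool = False  # stands in for the "toggle notification flag"

    def state(self) -> str:
        return worst(s.state for s in self.services)


@dataclass
class Node:
    name: str
    interfaces: List[Interface] = field(default_factory=list)
    sub_nodes: List["Node"] = field(default_factory=list)
    services: List[Service] = field(default_factory=list)
    ignore_upstream: bool = False

    def state(self) -> str:
        states = [s.state for s in self.services]
        # Flagged interfaces and sub-nodes are skipped in the upstream roll-up.
        states += [i.state() for i in self.interfaces if not i.ignore_upstream]
        states += [n.state() for n in self.sub_nodes if not n.ignore_upstream]
        return worst(states)
```

In this sketch, a router node with one "down" interface and one "up" interface reports "down" until the failing interface is flagged, after which the node reports "up".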

Referring now to FIG. 11 and FIG. 12, the preferred embodiment of the lower level
architecture of dNMS kernel 88 is shown. At the outset, it should be noted that this
architecture is common to all dNMS kernels, whether they reside in a hub or a remote. In
FIG. 11, the architecture of the integration server is shown, while the architecture of the
monolithic server is shown in FIG. 12. Note that the basic architecture is the same; however,
the functions are different.

A primary function of integration server 112 is to manage the configuration
information for the network it is configured to represent, such as network 24. An integration
server includes "placeholders" for each of the plug-in services, with each placeholder having
a unique name that corresponds to the plug-in service that monitors the network. These
placeholders are not operational services, however; they only represent configuration
information that is passed to operational plug-ins located in monolithic server 114. The
integration server manages this configuration information since it is connected to other
integration servers in other dNMS kernels and, as discussed previously, configuration
information propagates bidirectionally through the system. Therefore, the integration servers
manage and route the configurations of all of the monitoring and collection services for the
distributed network management system of the invention.

The monolithic server shares the same architecture as the integration server, as can be
seen in FIG. 12. Here, however, the services are operational and determine the state of
downstream objects on the network. Note that the numbers and types of services are not
limited. One such service is LTP as described above. Other services include, but are not
limited to, monitoring bandwidth thresholding, temperature, power supply status, disk space,
and environmental conditions. The system may optionally include one or more utility
modules, such as an auto discovery module that knows how a router works and can talk to a
router to automatically add interfaces. Essentially, any software module that is not in the
dNMS kernel itself can be "plugged-in" to the dNMS kernel to provide a service.

As indicated previously, each service has a unique identification (e.g., service or file
name). Referring to FIG. 13, these identifiers permit the integration server and monolithic
server to communicate through a conduit 116, which is an internal bus or other
communications link. This allows state information from the monolithic server to be
propagated to the corresponding service placeholder in the integration server for further
propagation to another dNMS kernel. It also allows for configuration information to be
propagated from the integration server to the monolithic server, whether the configuration
information originates from the same or a different dNMS kernel (e.g., from the hub or
remote in which the dNMS kernel resides, or from another hub or remote).
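
By way of illustration only, the following sketch shows how shared service identifiers could
let the two servers exchange information over the conduit: configuration is routed one way and
state the other, both keyed by service identifier. The class and method names
(`IntegrationServer`, `MonolithicServer`, `Conduit`, `push_config`, `push_state`) are
assumptions made for this example; the specification does not define a programming interface.

```python
class MonolithicServer:
    """Hosts the operational plug-ins, keyed by service identifier."""

    def __init__(self):
        self.configs = {}

    def configure(self, service_id, config):
        self.configs[service_id] = config


class IntegrationServer:
    """Hosts non-operational placeholders that mirror each service's state."""

    def __init__(self):
        self.placeholder_state = {}

    def record_state(self, service_id, state):
        self.placeholder_state[service_id] = state


class Conduit:
    """Internal link between the two servers; routes by service identifier."""

    def __init__(self, integration, monolithic):
        self.integration = integration
        self.monolithic = monolithic

    def push_config(self, service_id, config):
        # Configuration flows from the integration server to the plug-in.
        self.monolithic.configure(service_id, config)

    def push_state(self, service_id, state):
        # State flows from the plug-in back to its placeholder.
        self.integration.record_state(service_id, state)
```

Because both sides use the same identifier, neither needs to know how the other stores its
data; the identifier alone selects the matching placeholder or plug-in.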

It will be appreciated that assigning a unique identifier to every service also allows for
dNMS kernel to dNMS kernel communication. In addition to every service having a unique
identifier, each identifier has a relative timestamp that denotes the last time that the service
was changed. For example, when a "change" message such as an "add service" message is
transmitted, it would indicate that the change was made one-thousand (1000) seconds ago.
This helps resolve time-based synchronization problems.

Note also that every attribute type for the various objects, such as polling rate, node name,
etc., has a change message type. The reason for the time stamping is that, if two changes
for the same attribute of the object are received, the most recent is used. More simply, if a
more recent change is received than what is currently recorded, the more recent
information is kept instead. Note that the sender of the change does not care how the
recipient handles the message, only that it was received.
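
By way of illustration only, the timestamp rule can be sketched as follows. The sketch assumes
that each change message carries the service identifier, the attribute being changed, the new
value, and an age in seconds (the relative timestamp); the names `ChangeMessage`,
`AttributeStore`, and `apply` are hypothetical and are not taken from the specification. The
receiver converts the age to its own clock and keeps a change only if it is more recent than
what is already recorded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeMessage:
    service_id: str      # unique identifier of the service
    attribute: str       # e.g. "polling_rate" or "node_name"
    value: object        # new value for the attribute
    age_seconds: float   # relative timestamp: how long ago the change was made


class AttributeStore:
    """Keeps the most recent value per (service, attribute) pair."""

    def __init__(self):
        # (service_id, attribute) -> (value, local time at which the change was made)
        self._records = {}

    def apply(self, msg, now):
        """Apply msg if it is newer than the stored change; return True if kept."""
        changed_at = now - msg.age_seconds  # convert the relative age to the local clock
        key = (msg.service_id, msg.attribute)
        current = self._records.get(key)
        if current is not None and current[1] >= changed_at:
            return False  # the stored change is at least as recent; ignore this one
        self._records[key] = (msg.value, changed_at)
        return True

    def get(self, service_id, attribute):
        record = self._records.get((service_id, attribute))
        return None if record is None else record[0]
```

In this sketch a change made 1000 seconds ago overrides one made 2000 seconds ago regardless of
the order in which the two messages arrive.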

Referring to FIG. 14, the invention also includes a mechanism to control traffic
between hubs and remotes. Each time a change message is sent, it is placed into a queue. For
example, primary remote 16 sends a message to primary hub 12 through queue 118, and
messages from primary hub 12 to primary remote 16 are sent through queue 120. The
message is then sent to the appropriate recipient. When the recipient acknowledges receipt,
the message is dropped out of the queue. If the recipient does not have sufficient storage to
accept the message, it will not send an acknowledgement. In that event, the message will stay
in the queue indefinitely until an acknowledgment is received. For example, a remote could
keep the message in the queue and not take the message until it has room to receive the
message. Note that there are two reasons for a hub or remote to send a change message:
when that hub or remote generates the change message, or when propagating a change
message for another hub or remote. An example would be where a secondary remote
generates a change message. The secondary remote would send it to the primary remote and,
in turn, the primary remote would propagate it up to a hub.
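
By way of illustration only, the queue-and-acknowledgement behavior can be sketched as follows.
The names `AckQueue`, `BoundedReceiver`, `flush`, and `accept` are assumptions made for this
example. The recipient is modeled as a callable that returns True when it acknowledges a
message and False when it has no room for it.

```python
from collections import deque


class BoundedReceiver:
    """Stand-in for a hub or remote with limited storage for incoming messages."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.inbox = []

    def accept(self, message):
        if len(self.inbox) >= self.capacity:
            return False           # no room: do not acknowledge
        self.inbox.append(message)
        return True                # acknowledge receipt


class AckQueue:
    """Outbound queue that holds each message until the recipient acknowledges it."""

    def __init__(self, send):
        self._send = send          # callable(message) -> True if acknowledged
        self._pending = deque()

    def enqueue(self, message):
        self._pending.append(message)

    def flush(self):
        """Deliver in order; stop at the first unacknowledged message."""
        delivered = 0
        while self._pending:
            if not self._send(self._pending[0]):
                break              # no acknowledgement: the message stays queued
            self._pending.popleft()
            delivered += 1
        return delivered

    def __len__(self):
        return len(self._pending)
```

A later call to `flush` retries the held message, which mirrors a message remaining in the
queue until the recipient has room to receive it.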

The use of queues and acknowledgement controls will also keep the hubs from
becoming overloaded when all or a part of the system returns from a system failure. Suppose,
for example, that a secondary hub comes on line after a failure and thinks that it last received
change information from the primary hub thirty (30) seconds ago. Also assume that the
primary hub thinks that it last spoke to the secondary hub twelve-hundred (1200) seconds
ago. In this instance, the primary hub would send a batch change representing a list of all
changes in the past twelve-hundred (1200) seconds to the secondary hub, since that is the
oldest timestamp. This can occur in either direction. The queues exist to accommodate batch
transactions, rather than real-time transactions.
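
By way of illustration only, the selection of the batch window can be sketched as follows. The
function name `changes_to_resend` and the representation of the change log as
(age in seconds, change) pairs are assumptions made for this example, as is the premise that
both ages are known to the sender when the peers reconnect.

```python
def changes_to_resend(change_log, sender_last_contact_age, receiver_last_contact_age):
    """Return the changes that fall inside the oldest "last contact" window.

    change_log is a list of (age_seconds, change) pairs held by the sender.
    The two ages are how long ago each side believes it last heard from the other.
    """
    # Use the oldest timestamp so that nothing either side may have missed is skipped.
    window = max(sender_last_contact_age, receiver_last_contact_age)
    return [change for age, change in change_log if age <= window]
```

With ages of 1200 and 30 seconds, the window is 1200 seconds, matching the example in the text.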

Another aspect of the invention involves knowing if a peer is operational; for
example, a primary hub knowing that its corresponding secondary hub is operational and vice
versa. In the present invention, this is not determined simply by testing connectivity. Here,
all systems connected to each other send "keep alive" signals at specified intervals and look
for "keep alive" signals from their peers at specified intervals. For example, every forty (40)
seconds a "keep alive" signal is sent from the primary hub to the secondary hub. If a "keep
alive" signal is not received by the secondary hub within one-hundred and eighty (180)
seconds, the primary hub is considered to be down. Additionally, if a system tries to
communicate with its peer, but cannot, the peer is deemed to be down. Other polling periods
could be used, but the foregoing empirically have been found to provide the best results.

Also, with regard to the anatomy of a message, each message includes a unique
identifier, a timestamp, a change type (e.g., node add, node remove, IP address), message ID,
and information specific to the change type (e.g., node name or IP address). To prevent
looping in the system, each time a system sends a message it puts a host name in the message
and will never send a message to a system whose name is already in the message.
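
By way of illustration only, the "keep alive" rule and the loop-prevention rule can be sketched
together as follows. The names `PeerMonitor`, `keepalive_due`, and `forward` are assumptions
made for this example, as is the dictionary layout of a message (a `hosts` list of host names).
The sketch further assumes that the host name a system adds to a message is its own.

```python
KEEPALIVE_INTERVAL = 40    # seconds between "keep alive" signals sent to a peer
KEEPALIVE_TIMEOUT = 180    # seconds of silence after which the peer is deemed down


def keepalive_due(last_sent, now):
    """True when it is time to send the next "keep alive" signal."""
    return now - last_sent >= KEEPALIVE_INTERVAL


class PeerMonitor:
    """Tracks whether a peer is considered operational."""

    def __init__(self, now):
        self._last_heard = now
        self._send_failed = False

    def on_keepalive(self, now):
        self._last_heard = now
        self._send_failed = False

    def on_send_failure(self):
        # A failed attempt to communicate marks the peer down immediately.
        self._send_failed = True

    def is_up(self, now):
        if self._send_failed:
            return False
        return (now - self._last_heard) < KEEPALIVE_TIMEOUT


def forward(message, sender, neighbors):
    """Stamp the sender's host name on a copy of message and choose recipients.

    Returns (stamped_message, recipients); systems already named in the
    message are never chosen, which prevents the message from looping.
    """
    visited = message["hosts"] + [sender]
    stamped = dict(message, hosts=visited)
    recipients = [n for n in neighbors if n not in visited]
    return stamped, recipients
```

For instance, a message that originated at a secondary remote and is forwarded by the primary
remote would be sent on to a hub but not back to the secondary remote.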

Lastly, it will be appreciated by those skilled in the art that a possible system
configuration might involve monitoring a plurality of devices through one physical cable to
all devices. In the event that the cable becomes inoperational, each of those devices may be
reported as being inoperational. To reduce the need for "redundant" reporting of multiple
devices experiencing an outage when the outage is due to a cable or other common device
being inoperational, we can collate all devices into one and simply report that the common
interface is inoperational.
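
By way of illustration only, the collation of outage reports can be sketched as follows. The
function name `collate_outages`, the mapping from each device to its common cable or interface,
and the rule that a common interface is reported only when every device behind it is down are
all assumptions made for this example.

```python
from collections import defaultdict


def collate_outages(down_devices, upstream_of):
    """Collapse per-device outage reports into one report per common interface.

    down_devices: iterable of device names reported as inoperational.
    upstream_of:  dict mapping each monitored device to its common cable/interface.
    Returns a list of report strings.
    """
    down = set(down_devices)
    members = defaultdict(set)
    for device, upstream in upstream_of.items():
        members[upstream].add(device)

    reports = []
    covered = set()
    for upstream in sorted(members):
        group = members[upstream]
        # Report the common interface only when every device behind it is down.
        if len(group) > 1 and group <= down:
            reports.append(f"{upstream} is inoperational")
            covered |= group
    for device in sorted(down - covered):
        reports.append(f"{device} is inoperational")
    return reports
```

Devices that are down while a sibling on the same cable is still reachable are reported
individually, since the common interface is then unlikely to be the cause.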

Although the description above contains many specificities, these should not be
construed as limiting the scope of the invention but as merely providing illustrations of some
of the presently preferred embodiments of this invention. Thus the scope of this invention
should be determined by the appended claims and their legal equivalents. Therefore, it will
be appreciated that the scope of the present invention fully encompasses other embodiments
which may become obvious to those skilled in the art, and that the scope of the present
invention is accordingly to be limited by nothing other than the appended claims, in which
reference to an element in the singular is not intended to mean "one and only one" unless
explicitly so stated, but rather "one or more." All structural, chemical, and functional
equivalents to the elements of the above-described preferred embodiment that are known to
those of ordinary skill in the art are expressly incorporated herein by reference and are
intended to be encompassed by the present claims. Moreover, it is not necessary for a device
or method to address each and every problem sought to be solved by the present invention,
for it to be encompassed by the present claims. Furthermore, no element, component, or
method step in the present disclosure is intended to be dedicated to the public regardless of
whether the element, component, or method step is explicitly recited in the claims. No claim
element herein is to be construed under the provisions of 35 U.S.C. 112, sixth paragraph,
unless the element is expressly recited using the phrase "means for."

Table 1

Examples for filling out one entry in the LTP sliding window

SINGLE ROW FROM LTP VIEWER | DESCRIPTION OF SITUATION | ICMP PERCENTAGE RECEIVED | SNMP STATUS | RESULTING PERCENTAGE FOR TIME PERIOD
-4 min (100%), up | normal up interface, passing traffic (100% ICMP x 1 SNMP = 100%) | 100% | up (1x) | 100%
-4 min (0%), down | normal down interface, not passing anything (0% ICMP x 0 SNMP = 0%) | 0% | down (0x) | 0%
-4 min (40%), up | major packet loss to interface, but interface is still up (40% ICMP x 1 SNMP = 40%) | 40% | up (1x) | 40%
-4 min (36%), snmp-unknown | interface passing most traffic, but problem gathering snmp info (likely an snmp-renumber issue) (90% ICMP x .4 SNMP = 36%) | 90% | unknown (no response) (.4x) | 36%
-4 min (0%), down | routing problem causing pings to go through anyway, even though interface is down (or, an snmp-renumber issue) (60% ICMP x 0 SNMP = 0%) | 60% | down (0x) | 0%
-4 min (100%), undefined | normal pings on an interface with no SNMP (web server, etc.) (70% ICMP = 70%) | 70% | - | 70%
-4 min (-), up | snmp-only monitoring of un-numbered interface, no ICMP status at all (1 SNMP = 100%) | - | up (1x) | 100%

Table 2

Example output for entire window of data

TIME PERIOD | PERCENTAGE | SNMP STATE | WEIGHT
-4 min | 33% | up | 2x
-3 min | 33% | up | 2x
-2 min | 0% | up | 3x
-1 min | 100% | up | 3x
0 min | 50% | up | 4x


Table 3

Total ratio calculation for LTP view in Table 2

PERCENTAGE RECEIVED FOR TIME PERIOD | WEIGHT | RESULTING PERCENTAGE
33% | 2x | +66%
33% | 2x | +66%
0% | 3x | +0%
100% | 3x | +300%
50% | 4x | +200%
- | - | 632% / 14 = 45%
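
By way of illustration only, the LTP arithmetic shown in the tables can be sketched as follows:
the per-period status percentage of Table 1, the weighted total ratio computed above from the
window in Table 2, and the mapping to a final state given in Table 4. The function names are
assumptions made for this example. The multipliers (1, 0, and .4) and the weights (2, 2, 3, 3,
4) are the example values from the tables, and treating a ratio of exactly 40 or 60 as
"unknown or intermittent" is a further assumption, since the tables do not cover those
boundary values.

```python
# Multipliers applied to the ICMP percentage for each SNMP status (values from Table 1).
SNMP_MULTIPLIER = {"up": 1.0, "down": 0.0, "unknown": 0.4}


def period_percentage(icmp_percent, snmp_status):
    """Status percentage for one polling period.

    icmp_percent: percentage of pings returned, or None if ICMP is not monitored.
    snmp_status:  "up", "down", "unknown", or None if SNMP is not monitored.
    """
    if snmp_status is None:
        return float(icmp_percent)          # ICMP-only monitoring
    multiplier = SNMP_MULTIPLIER[snmp_status]
    if icmp_percent is None:
        return 100.0 * multiplier           # SNMP-only monitoring
    return icmp_percent * multiplier


def total_ratio(percentages, weights=(2, 2, 3, 3, 4)):
    """Weighted average over the sliding window, oldest period first."""
    weighted = sum(p * w for p, w in zip(percentages, weights))
    return weighted / sum(weights)


def final_state(ratio):
    """Map the total ratio to a state using the thresholds of Table 4."""
    if ratio < 40:
        return "down"
    if ratio > 60:
        return "up"
    return "unknown or intermittent"
```

Applied to the window in Table 2, `total_ratio([33, 33, 0, 100, 50])` gives 632 / 14, or about
45, which falls in the "unknown or intermittent" band.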



Table 4

Mapping of total ratio percentage to final state of LTP

TOTAL RATIO LEVEL | RESULTING STATE
ratio < 40 | down
40 < ratio < 60 | unknown or intermittent
ratio > 60 | up

CLAIMS

What is claimed is: 

1. A distributed network management system, comprising:

(a) a hub server; and

(b) a remote server;

(c) said remote server capable of communicating with a network device and said
hub;

(d) said hub server capable of communicating with said remote server and said
network device;

(e) wherein

(i) if said hub server and said remote server are operational, said hub
server communicates with said remote server, and

(ii) if said hub server is operational and said remote server is inoperational,
said hub server communicates with said network device.



2. A distributed network management system, comprising:

(a) a primary hub server;

(b) a secondary hub server; and

(c) a remote server;

(d) said remote server capable of communicating with a network device, said
primary hub server and said secondary hub server;

(e) said primary hub server capable of communicating with said remote server and
said secondary hub server;

(f) said secondary hub server capable of communicating with said remote server
and said primary hub server;

(g) wherein

(i) if said primary hub server and said remote server are operational, said
primary hub server communicates with said remote server, and

(ii) if said primary hub server is inoperational, said secondary hub server is
operational, and said remote server is operational, said secondary hub server
communicates with said remote server.

3. A system as recited in claim 2, wherein said primary hub server is capable of
communicating with said network device, and wherein if said primary hub server is
operational and said remote server is inoperational, said primary hub server communicates
with said network device.

4. A system as recited in claim 2, wherein said secondary hub server is capable of
communicating with said network device, and wherein if said primary hub server is
inoperational and said remote server is inoperational, said secondary hub server
communicates with said network device.

5. A distributed network management system, comprising:

(a) a hub server;

(b) a primary remote server; and

(c) a secondary remote server;

(e) said primary remote server capable of communicating with a remote network,
said secondary remote server, and said hub server;

(f) said secondary remote server capable of communicating with said remote
network, said primary remote server, and said hub server;

(g) said hub server capable of communicating with said primary remote server and
said secondary remote server;

(h) wherein

(i) if said hub server and said primary remote server are operational, said
hub server communicates with said primary remote server, and

(ii) if said hub server is operational, said primary remote server is
inoperational, and said secondary remote server is operational, said hub server
communicates with said secondary remote server.

6. A system as recited in claim 5, wherein said hub server is capable of
communicating with said network, and wherein if said hub server is operational and said
primary and said secondary remote servers are inoperational, said hub server communicates
with said network device.

7. A distributed network management system, comprising:

(a) a primary hub server;

(b) a secondary hub server;

(c) a primary remote server; and

(d) a secondary remote server;

(e) said primary remote server capable of communicating with a remote network,
said secondary remote server, said primary hub server and said secondary hub server;

(f) said secondary remote server capable of communicating with said remote
network, said primary remote server, said primary hub server and said secondary hub server;

(g) said primary hub server capable of communicating with said secondary hub
server, said primary remote server, said secondary remote server, and said remote network;

(h) said secondary hub server capable of communicating with said primary hub
server, said primary remote server, said secondary remote server, and said remote network;

(i) wherein

(i) if said primary hub server and said primary remote server are
operational, said primary hub server communicates with said primary remote server,

(ii) if said primary hub server is operational, said primary remote server is
inoperational, and said secondary remote server is operational, said primary hub
server communicates with said secondary remote server,

(iii) if said primary hub server is operational and said primary and
secondary remote servers are inoperational, said primary hub server communicates
with said remote network,

(iv) if said primary hub server is inoperational, said secondary hub server is
operational, and said primary remote server is operational, said secondary hub server
communicates with said primary remote server,

(v) if said primary hub server is inoperational, said secondary hub server is
operational, said primary remote server is inoperational, and said secondary remote
server is operational, said secondary hub server communicates with said secondary
remote server,

(vi) if said primary hub server is inoperational, said secondary hub server
is operational, and said primary and secondary remote servers are inoperational, said
secondary hub server communicates with said remote network, and

(vii) if said primary hub server is operational, said secondary hub server is
operational, and said primary and secondary remote servers are inoperational, said
primary hub server and said secondary hub server communicate with said remote
network.

8. A distributed network management system, comprising:

(a) a hub server;

(b) a remote server;

(c) said remote server capable of communicating with a network device and said
hub;

(d) said hub server capable of communicating with said remote server and said
network device; and

(e) programming associated with at least one of said servers for carrying out the
operations of

(i) if said hub server and said remote server are operational, causing said
hub server to communicate with said remote server, and

(ii) if said hub server is operational and said remote server is inoperational,
causing said hub server to communicate with said network device.

9. A distributed network management system, comprising:

(a) a primary hub server;

(b) a secondary hub server;

(c) a remote server;

(d) said remote server capable of communicating with a network device, said
primary hub server and said secondary hub server;

(e) said primary hub server capable of communicating with said remote server and
said secondary hub server;

(f) said secondary hub server capable of communicating with said remote server
and said primary hub server; and

(g) programming associated with at least one of said servers for carrying out the
operations of

(i) if said primary hub server and said remote server are operational,
causing said primary hub server to communicate with said remote server, and

(ii) if said primary hub server is inoperational, said secondary hub server is
operational, and said remote server is operational, causing said secondary hub server
to communicate with said remote server.

10. A system as recited in claim 9, wherein said primary hub server is capable of
communicating with said network device, and further comprising programming for carrying
out the operation of causing said primary hub server to communicate with said network
device if said primary hub server is operational and said remote server is inoperational.

11. A system as recited in claim 9, wherein said secondary hub server is capable of
communicating with said network device, and further comprising programming for carrying
out the operation of causing said secondary hub server to communicate with said network
device if said primary hub server is inoperational and said remote server is inoperational.

12. A distributed network management system, comprising:

(a) a hub server;

(b) a primary remote server;

(c) a secondary remote server;

(e) said primary remote server capable of communicating with a remote network,
said secondary remote server, and said hub server;

(f) said secondary remote server capable of communicating with said remote
network, said primary remote server, and said hub server;

(g) said hub server capable of communicating with said primary remote server and
said secondary remote server; and

(h) programming associated with at least one of said servers for carrying out the
operations of

(i) if said hub server and said primary remote server are operational,
causing said hub server to communicate with said primary remote server, and

(ii) if said hub server is operational, said primary remote server is
inoperational, and said secondary remote server is operational, causing said hub server
to communicate with said secondary remote server.

13. A system as recited in claim 12, wherein said hub server is capable of
communicating with said network, and further comprising programming for carrying out the
operation of causing said hub server to communicate with said network device if said hub
server is operational and said primary and said secondary remote servers are inoperational.

14. A distributed network management system, comprising:

(a) a primary hub server;

(b) a secondary hub server;

(c) a primary remote server;

(d) a secondary remote server;

(e) said primary remote server capable of communicating with a remote network,
said secondary remote server, said primary hub server and said secondary hub server;

(f) said secondary remote server capable of communicating with said remote
network, said primary remote server, said primary hub server and said secondary hub server;

(g) said primary hub server capable of communicating with said secondary hub
server, said primary remote server, said secondary remote server, and said remote network;

(h) said secondary hub server capable of communicating with said primary hub
server, said primary remote server, said secondary remote server, and said remote network;
and

(i) programming associated with at least one of said servers for carrying out the
operations of

(i) if said primary hub server and said primary remote server are
operational, causing said primary hub server to communicate with said primary
remote server,

(ii) if said primary hub server is operational, said primary remote server is
inoperational, and said secondary remote server is operational, causing said primary
hub server to communicate with said secondary remote server,

(iii) if said primary hub server is operational and said primary and
secondary remote servers are inoperational, causing said primary hub server to
communicate with said remote network,

(iv) if said primary hub server is inoperational, said secondary hub server is
operational, and said primary remote server is operational, causing said secondary hub
server to communicate with said primary remote server,

(v) if said primary hub server is inoperational, said secondary hub server is
operational, said primary remote server is inoperational, and said secondary remote
server is operational, causing said secondary hub server to communicate with said
secondary remote server,

(vi) if said primary hub server is inoperational, said secondary hub server
is operational, and said primary and secondary remote servers are inoperational,
causing said secondary hub server to communicate with said remote network, and

(vii) if said primary hub server is operational, said secondary hub server is
operational, and said primary and secondary remote servers are inoperational, causing
said primary hub server and said secondary hub server to communicate with said
remote network.



15. A distributed network management system, comprising:

(a) a hub server; and

(b) a remote server;

(c) said remote server capable of communicating with a network device and said
hub server;

(d) wherein configuration parameters for said remote server to communicate with
said network device can be propagated between said hub server and said remote server
bidirectionally.



16. A distributed network management system, comprising:

(a) a network server capable of communicating with a network device; and

(b) means associated with said network server for deriving state information from

17. A system as recited in claim 16, wherein said LTP comprises:

(a) defining a polling interval;

(b) sending, from an ICMP server, a plurality of pings to an interface address on
said network device during said polling interval;

(c) monitoring the number of pings returned from said network device and
converting said number to a percentage based on the number of pings sent;

(d) sending an SNMP query to said network device and determining operational
status of said network device from said SNMP query, said operational status comprising "up",
"down", and "unknown";

(e) using the percentage of pings returned and the SNMP status, generating a
status percentage for the polling period by multiplying the percentage pings returned by a
constant value associated with said operational status, said constant value comprising a first
value if the operational status is "up", a second value if the operational status is "down", and a
third value if the operational status is "unknown"; and

(f) computing a weighted average of the status percentages for current and
previous four polling periods and determining the state of the network device from the
weighted average.

18. A system as recited in claim 16, further comprising:

(a) means for defining a polling interval;

(b) means for sending, from an ICMP server, a plurality of pings to an interface
address on said network device during said polling interval;

(c) means for monitoring the number of pings returned from said network device
and converting said number to a percentage based on the number of pings sent;

(d) means for sending an SNMP query to said network device and determining
operational status of said network device from said SNMP query, said operational status
comprising "up", "down", and "unknown";

(e) means for using the percentage of pings returned and the SNMP status,
generating a status percentage for the polling period by multiplying the percentage pings
returned by a constant value associated with said operational status, said constant value
comprising a first value if the operational status is "up", a second value if the operational
status is "down", and a third value if the operational status is "unknown"; and

(f) means for computing a weighted average of the status percentages for current
and previous four polling periods and determining the state of the network device from the
weighted average.

19. A system as recited in claim 16, further comprising programming associated
with said network server for carrying out the functions of:

(a) defining a polling interval;

(b) sending, from an ICMP server, a plurality of pings to an interface address on
said network device during said polling interval;

(c) monitoring the number of pings returned from said network device and
converting said number to a percentage based on the number of pings sent;

(d) sending an SNMP query to said network device and determining operational
status of said network device from said SNMP query, said operational status comprising "up",
"down", and "unknown";

(e) using the percentage of pings returned and the SNMP status, generating a
status percentage for the polling period by multiplying the percentage pings returned by a
constant value associated with said operational status, said constant value comprising a first
value if the operational status is "up", a second value if the operational status is "down", and a
third value if the operational status is "unknown"; and

(f) computing a weighted average of the status percentages for current and
previous four polling periods and determining the state of the network device from the
weighted average.

20. A system for deriving state information from a network device, comprising:

(a) a computer; and

(b) programming associated with said computer for carrying out the operations of

(i) defining a polling interval;

(ii) sending, from an ICMP server, a plurality of pings to an interface
address on said network device during said polling interval;

(iii) monitoring the number of pings returned from said network device and
converting said number to a percentage based on the number of pings sent;

(iv) sending an SNMP query to said network device and determining
operational status of said network device from said SNMP query, said operational
status comprising "up", "down", and "unknown";

(v) using the percentage of pings returned and the SNMP status,
generating a status percentage for the polling period by multiplying the percentage
pings returned by a constant value associated with said operational status, said
constant value comprising a first value if the operational status is "up", a second value
if the operational status is "down", and a third value if the operational status is
"unknown"; and

(vi) computing a weighted average of the status percentages for current and
previous four polling periods and determining the state of the network device from the
weighted average.

21. A method for distributed network management, comprising:

(a) providing a hub server;

(b) providing a remote server;

(c) said remote server capable of communicating with a network device and said
hub;

(d) said hub server capable of communicating with said remote server and said
network device;

(e) if said hub server and said remote server are operational, causing said hub
server to communicate with said remote server; and

(f) if said hub server is operational and said remote server is inoperational,
causing said hub server to communicate with said network device.

22. A method for distributed network management, comprising:

(a) providing a primary hub server;

(b) providing a secondary hub server;

(c) providing a remote server;

(d) said remote server capable of communicating with a network device, said
primary hub server and said secondary hub server;

(e) said primary hub server capable of communicating with said remote server and
said secondary hub server;

(f) said secondary hub server capable of communicating with said remote server
and said primary hub server;

(g) if said primary hub server and said remote server are operational, causing said
primary hub server to communicate with said remote server, and

(h) if said primary hub server is inoperational, said secondary hub server is
operational, and said remote server is operational, causing said secondary hub server to
communicate with said remote server.



23. A system as recited in claim 22, wherein said primary hub server is capable of
communicating with said network device, and further comprising causing said primary hub
server to communicate with said network device if said primary hub server is operational and
said remote server is inoperational.

24. A system as recited in claim 22, wherein said secondary hub server is capable
of communicating with said network device, and further comprising causing said secondary
hub server to communicate with said network device if said primary hub server is
inoperational and said remote server is inoperational.

25. A method for distributed network management, comprising:

(a) providing a hub server;

(b) providing a primary remote server;

(c) providing a secondary remote server;

(e) said primary remote server capable of communicating with a remote network,
said secondary remote server, and said hub server;

(f) said secondary remote server capable of communicating with said remote
network, said primary remote server, and said hub server;

(g) said hub server capable of communicating with said primary remote server and
said secondary remote server;

(h) if said hub server and said primary remote server are operational, causing said
hub server to communicate with said primary remote server; and

(i) if said hub server is operational, said primary remote server is inoperational,
and said secondary remote server is operational, causing said hub server to communicate with
said secondary remote server.

26. A method as recited in claim 25, wherein said hub server is capable of
communicating with said network, and further comprising causing said hub server to
communicate with said network device if said hub server is operational and said primary and
said secondary remote servers are inoperational.

27. A method for distributed network management, comprising: 

(a) providing a primary hub s^er; 

(b) providing a secondary hub server; 

(c) providing a primary remote server; 

(d) providing a secondaiy remote server; 

(e) said primary remote server capable of communicating with a remote network, 
said secondary remote server, said primary hub server and said secondary hub server; 

(f) said secondary remote server capable of communicating with said remote 
network, said primary remote server, said primary hub server and said secondary hub server; 

(g) said primary hub server capable of communicating with said secondary hub 
server, said primary remote server, said secondary remote server, and said remote networl^ 

(h) said secondary hub server enable of communicating with said primary hub 
server, said primary remote server, said secondary remote server, and said remote network; 

(i) if said primary hub servo: and said primary remote server are Ojperational, 
causing said primary hub server to communicate with said primary remote server; 

(j) if said primary hub server is operational, said primary remote server is 
inoporational, and said secondaiy remote server is operational, causing said primary hub 
server conununicates with said secondaiy remote servCT; 

(k) if said primary hub server is operational and said primary and secondary 
remote servers are inoperational, causing said primary hub server to communicate with said 
remote network; 

(l) if said primary hub server is inoperational, said secondary hub server is 
operational, and said primary remote server is operational, causing said secondary hub server 
to communicate with said primary remote server; 

(m) if said primary hub server is inoperational, said secondary hub server is 
operational, said primary remote server is inoperational, and said secondary remote server is 
operational, causing said secondary hub server to communicate with said secondary remote 
server; 

(n) if said primary hub server is inoperational, said secondary hub server is 
operational, and said primary and secondary remote servers are inoperational, causing said 
secondary hub server to communicate with said remote network; and 

(o) if said primary hub server is operational, said secondary hub server is 
operational, and said primary and secondary remote servers are inoperational, causing said 
primary hub server and said secondary hub server to communicate with said remote network. 
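
For illustration only, the conditions of elements (i) through (o) of claim 27 can be read as a decision procedure over four operational flags. The sketch below is one possible reading and is not part of the claimed method. The function name `select_links`, the string labels, and the empty result when neither hub server is operational are assumptions introduced here; the claim does not address the case in which both hub servers are inoperational.

```python
def select_links(primary_hub_up, secondary_hub_up,
                 primary_remote_up, secondary_remote_up):
    """Return (hub, target) pairs per one reading of claim 27, elements (i)-(o).

    All names and labels are hypothetical and used only for illustration.
    """
    # Pick the target: primary remote first, then secondary remote,
    # and the remote network itself only if both remotes are inoperational.
    if primary_remote_up:
        target = "primary_remote"
    elif secondary_remote_up:
        target = "secondary_remote"
    else:
        target = "remote_network"

    if primary_hub_up:
        links = [("primary_hub", target)]        # elements (i), (j), (k)
        if secondary_hub_up and target == "remote_network":
            links.append(("secondary_hub", target))  # element (o)
        return links

    if secondary_hub_up:
        return [("secondary_hub", target)]       # elements (l), (m), (n)

    # Assumption: the claim is silent when both hub servers are inoperational.
    return []
```

Under this reading, `select_links(False, True, False, True)` corresponds to element (m) and yields a single link from the secondary hub server to the secondary remote server.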

28. A method for distributed network management, comprising: 

(a) providing a hub server; 

(b) providing a remote server; 

(c) said remote server capable of communicating with a network device and said 
hub server; and 

(d) propagating configuration parameters for said remote server to communicate 
with said network device between said hub server and said remote server bidirectionally. 

29. A method for distributed network management, comprising: 

(a) providing a network server capable of communicating with a network device; 
and 

(b) deriving state information from said network device using LTP. 



30. A method as recited in claim 29, wherein said LTP comprises: 

(a) defining a polling interval; 

(b) sending, from an ICMP server, a plurality of pings to an interface address on 
said network device during said polling interval; 

(c) monitoring the number of pings returned from said network device and 
converting said number to a percentage based on the number of pings sent; 

(d) sending an SNMP query to said network device and determining operational 
status of said network device from said SNMP query, said operational status comprising "up", 
"down", and "unknown"; 

(e) using the percentage of pings returned and the SNMP status, generating a 
status percentage for the polling period by multiplying the percentage of pings returned by a 
constant value associated with said operational status, said constant value comprising a first 
value if the operational status is "up", a second value if the operational status is "down", and a 
third value if the operational status is "unknown"; and 

(f) computing a weighted average of the status percentages for current and 
previous four polling periods and determining the state of the network device from the 
weighted average. 
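
By way of a non-limiting sketch, the arithmetic of elements (c), (e), and (f) of claim 30 might be expressed as follows. The claim recites only that a first, second, and third constant value exist, that the average over the current and previous four polling periods is weighted, and that the state is determined from that average. The specific constants in `STATUS_FACTOR`, the weights in `WEIGHTS`, the threshold `UP_THRESHOLD`, and all names below are therefore assumptions chosen for illustration. The ping counts and SNMP status are passed in as plain values; no ICMP or SNMP traffic is generated.

```python
from collections import deque

# Hypothetical constants: the claim recites a first, second, and third value only.
STATUS_FACTOR = {"up": 1.0, "down": 0.0, "unknown": 0.5}
# Hypothetical weights, newest polling period first (five periods in total).
WEIGHTS = [5, 4, 3, 2, 1]
# Hypothetical cut-off for reporting the device as "up".
UP_THRESHOLD = 60.0


def status_percentage(pings_sent, pings_returned, snmp_status):
    """Elements (c) and (e): ping percentage scaled by the status constant."""
    if pings_sent <= 0:
        raise ValueError("pings_sent must be positive")
    ping_pct = 100.0 * pings_returned / pings_sent
    return ping_pct * STATUS_FACTOR[snmp_status]


class LtpTracker:
    """Keeps the status percentages of the current and previous four periods."""

    def __init__(self):
        # Newest value at index 0; the oldest is dropped once five are stored.
        self.history = deque(maxlen=5)

    def record(self, pings_sent, pings_returned, snmp_status):
        """Store one polling period and return the new weighted average."""
        self.history.appendleft(
            status_percentage(pings_sent, pings_returned, snmp_status))
        return self.weighted_average()

    def weighted_average(self):
        """Element (f): weighted average over the stored periods."""
        if not self.history:
            raise ValueError("no polling periods recorded")
        weights = WEIGHTS[:len(self.history)]
        total = sum(w * s for w, s in zip(weights, self.history))
        return total / sum(weights)

    def state(self):
        """Assumed mapping from the weighted average to a device state."""
        return "up" if self.weighted_average() >= UP_THRESHOLD else "down"
```

With these assumed values, a period in which 8 of 10 pings return and SNMP reports "up" yields a status percentage of 80, while the same ping result with an "unknown" SNMP status yields 40.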

31. A method for deriving state information from a network device, comprising: 

(a) defining a polling interval; 

(b) sending, from an ICMP server, a plurality of pings to an interface address on 
said network device during said polling interval; 

(c) monitoring the number of pings returned from said network device and 
converting said number to a percentage based on the number of pings sent; 

(d) sending an SNMP query to said network device and determining operational 
status of said network device from said SNMP query, said operational status comprising "up", 
"down", and "unknown"; 

(e) using the percentage of pings returned and the SNMP status, generating a 
status percentage for the polling period by multiplying the percentage of pings returned by a 
constant value associated with said operational status, said constant value comprising a first 
value if the operational status is "up", a second value if the operational status is "down", and a 
third value if the operational status is "unknown"; and 

(f) computing a weighted average of the status percentages for current and 
previous four polling periods and determining the state of the network device from the 
weighted average. 